Parsimonious additive models
Abstract
We present a new method for function estimation and variable selection, specifically designed for additive models fitted by cubic splines. Our method regularizes additive models using the ℓ1-norm, which generalizes Tibshirani’s lasso to the nonparametric setting. As in the linear case, it shrinks coefficients, setting some of them exactly to zero. It yields parsimonious models, selects significant variables, and reveals nonlinearities in the effects of predictors. Two strategies for finding parsimonious additive model solutions are proposed. Both are based on a fixed-point algorithm combined with a singular value decomposition that considerably reduces computation. The empirical behavior of parsimonious additive models is compared with that of the adaptive backfitting BRUTO algorithm. The results allow us to characterise the domains in which our approach is effective: it performs significantly better than BRUTO when model estimation is challenging. The method is illustrated on real data from the Cophar 1 ANRS 102 trial, where parsimonious additive models are applied to predict the indinavir plasma concentration in HIV patients. The results suggest that our method is a promising technique for both research and applications.
Similar resources
Parsimonious classification via generalised linear mixed models
We devise a classification algorithm based on generalised linear mixed model (GLMM) technology. The algorithm incorporates spline smoothing, additive model-type structures and model selection. For reasons of speed we employ the Laplace approximation, rather than Monte Carlo methods. Tests on real and simulated data show the algorithm to have good classification performance. Moreover, the result...
Determination of the genetic and non-genetic variations in growth curve of Zandi lambs by random regression models
The aim of this study was to model the variances and covariances of body weight in Zandi sheep from 60 to 365 days of age using random regression models (RRM). Legendre polynomials of different orders were used to model the direct and maternal covariances. Mean trends were also modeled through a quadratic regression on orthogonal polynomials of age. Homogeneity and heterogeneity of the residual...
Lifetime genetic analysis of milk yield in Iranian Holstein cows using repeatability and pre-structured multivariate models
Milk yield records from 1st to 5th lactations of Iranian Holstein cows were analyzed using repeatability and a number of multivariate models that varied in additive genetic variance structure. A total of 313,006 milk yield records were used. The records were obtained from 116,531 cows born between 2001 and 2005. The animals originated from 2,355 sires and 91,212 dams. A multivariate model with h...
Generalized additive distributed lag models: quantifying mortality displacement.
There are a number of applied settings where a response is measured repeatedly over time, and the impact of a stimulus at one time is distributed over several subsequent response measures. In the motivating application the stimulus is an air pollutant such as airborne particulate matter and the response is mortality. However, several other variables (e.g. daily temperature) impact the response ...
Path consistent model selection in additive risk model via Lasso.
As a flexible alternative to the Cox model, the additive risk model assumes that the hazard function is the sum of the baseline hazard and a regression function of covariates. For right censored survival data when variable selection is needed along with model estimation, we propose a path consistent model selector using a modified Lasso approach, under the additive risk model assumption. We sho...
Journal title:
- Computational Statistics & Data Analysis
Volume 51, Issue
Pages -
Publication date 2007